ABSTRACT

Stellar activity may induce Doppler variability at the level of a few m/s, which can then be
confused with the Doppler signal of an exoplanet orbiting the star. To first order, linear correlations
between radial velocity measurements and activity indices have been proposed to account for any
such activity-induced variability. The likely presence of two super-Earths orbiting Kapteyn’s star
was reported in Anglada-Escudé et al. (2014), but this claim was recently challenged by Robertson
et al. (2015b), who argued for evidence of a rotation period (143 days) at three times the orbital
period of one of the proposed planets (Kapteyn’s b, P=48.6 days), and for the existence of strong
linear correlations between its Doppler signal and activity data. By re-analyzing the data using
global optimization methods and model comparison, we show that such a claim is incorrect given
that 1) the choice of a rotation period at 143 days is unjustified, and 2) the presence of linear
correlations is not supported by the data. We conclude that the radial velocity signals of Kapteyn’s
star remain more simply explained by the presence of two super-Earth candidates orbiting it. We
also advocate for the use of global optimization procedures and objective arguments, instead of
claims lacking minimal statistical support.
1. Introduction


Recently, the search for low-amplitude signals in radial velocity
time-series has reached the point where detection of Doppler signals at
the level of 1 m/s or less is technically possible (Pepe et al. 2011;
Tuomi & Anglada-Escudé 2013). Along with this rise in precision have
come claims, and counter-claims, of the detection of low-mass planets
(e.g. α Centauri, Dumusque et al. 2012; Hatzes 2013; HD 41248,
Jenkins et al. 2013; Jenkins & Tuomi 2014; Santos et al. 2014; GJ 581,
Mayor et al. 2009; Robertson et al. 2014; Anglada-Escudé & Tuomi
2015). Given the sensitive nature of these works, it is clear that more
work must be done to develop a clear framework for what constitutes a
Doppler signal detection and what does not.

It is known that stellar activity might induce spurious signals in
precision Doppler measurements (e.g. Queloz et al. 2001). In particular,
variability in chromospheric activity indices is supposed to originate
from localized active regions on stars. Changes in the local properties
of the visible surface of stars can induce apparent Doppler shifts that
do not necessarily average out over time, producing apparent signals
that might be mistaken for planets (e.g. Hatzes 2002; Bonfils et al.
2007). Theoretical and numerical simulations suggest that variability
in some of these indices should linearly correlate with apparent radial
velocity shifts (Boisse et al. 2011; Dumusque et al. 201…).
Robertson et al. (2014) exploited this expected linear correlation to
propose that the planet candidate GJ 581d was caused by stellar
variability, by showing some correlations of activity indices with
residual time-series (all other signals removed). Since residual
time-series are not representative of the original data, such
conclusions were challenged by Anglada-Escudé & Tuomi (2015). In
response, Robertson et al. (2015a) admitted inconsistencies in their
statistical analysis but claimed that their interpretation of the data
was physically more sound. Along these lines, in Robertson & Mahadevan
(2014) and Robertson et al. (2015b) (RM15 hereafter), similar
qualitative arguments were provided to argue that several super-Earth
mass planet candidates orbiting nearby M-dwarf stars were likely to be
spurious. In this paper we show that the claims in Robertson et al.
(2015b) are unsupported by a global fit to the data, so such results
should be regarded as inconclusive.

The data used in this paper come directly from RM15 to replicate their
setup as closely as possible. The datasets in RM15 contain measurements
obtained with the HARPS and the HIRES spectrometers. These differ from
the ones in Anglada-Escudé et al. (2014) in the sense that RM15
includes additional spectroscopic indices and, additionally, three
HARPS epochs (out of 95) were removed. We also include the analysis of
historical V magnitude photometric measurements obtained by the ASAS
project (Pojmanski 1997). A more detailed description of the
measurements is given in both papers and references therein. We start
by reviewing possible periodic signals in the activity indices
presented by RM15 in Section 2. Section 3 introduces a minimal Doppler
model that includes linear correlation terms caused by activity. To
remove ambiguities about the framework used, we perform the analyses in
a frequentist (Section 3.2) and a Bayesian framework (Section 3.3),
both providing a consistent picture of no correlations. Section 4
discusses the discrepancy between our results and the analysis
presented in RM15. A summary and concluding remarks are given in
Section 5.

2. Possible signals in activity indices and 
ASAS photometry 

Fig. 1.— Likelihood periodograms of the activity indices in RM15 (from
top to bottom: BIS, FWHM, Hα, Na D, S-index) and ASAS V band
photometry (bottom). With the exception of BIS, variability above the
1% FAP threshold (horizontal line) is detected in all indices. The most
relevant possible periods in each activity index are flagged with
arrows. The similarities between periodograms in different indices
(long period trend, and possible signals between 80 and 300 days)
suggest similar, non-strictly periodic stellar variability in these
time-ranges but do not point to a clearly preferred signal.

We perform a likelihood periodogram analysis of the activity indices as
provided by RM15 to verify the claim of a clear rotation period at 143
days. Likelihood-ratio periodograms solve for all the free parameters
of the model simultaneously while a sinusoidal signal is added at each
of a list of trial periods (x-axis). Such periodograms are a
generalization of Lomb-Scargle periodograms (Scargle 1982) to account
for models more complex than a single sinusoid (Baluev 2009), including
parameters of the noise model (e.g. extra white noise for the activity
data). The signal producing the highest improvement of the maximum
log-likelihood statistic (y-axis) is the preferred one, and its
significance can then be assessed using the recipes introduced by
Baluev (2009, 2013), which produce analytic estimates of the false
alarm probability of detection (FAP). As a general rule, signals above
a FAP threshold of 1% can be considered significant, but a more
conservative threshold of 0.1% is sometimes used. We show both
thresholds in all the periodograms presented throughout the paper. In
the case of activity data, we assume that the signal is modelled by one
constant (equivalent to the mean of the time-series), one sinusoid
(phase and amplitude are free parameters), and an extra white-noise
parameter added in quadrature to the nominal uncertainties of each
measurement. As mentioned by RM15, nights with several measurements
might be overweighted and bias the signal searches. To account for
this, we present the analysis using night averages only (45 independent
epochs). Our conclusions, however, did not differ substantially if all
datapoints were included.
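The periodogram recipe described above can be sketched in a few lines. The following is only a minimal illustration, not the code used for the analysis: the function names, the coarse jitter grid, and the synthetic data (45 epochs, a 100-day sinusoid) are all assumptions made for the example. For each trial period it fits a constant plus a sinusoid by weighted least squares, maximizes the Gaussian log-likelihood over an extra white-noise term added in quadrature, and reports the improvement Δ ln L over the constant-only model. No FAP calibration is attempted here.

```python
import math
import random

def solve(a, b):
    """Solve a small dense linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def max_log_like(t, y, err, period=None, jitters=(0.0, 0.5, 1.0, 2.0, 4.0)):
    """Maximum ln L of a constant (+ sinusoid) model over a grid of jitter values."""
    best = -math.inf
    for s in jitters:  # hypothetical coarse grid for the extra white noise
        w = [1.0 / (e * e + s * s) for e in err]
        basis = [[1.0] * len(t)]
        if period is not None:
            om = 2.0 * math.pi / period
            basis += [[math.sin(om * x) for x in t], [math.cos(om * x) for x in t]]
        # Weighted normal equations for the linear parameters.
        a = [[sum(wi * p * q for wi, p, q in zip(w, bp, bq)) for bq in basis] for bp in basis]
        b = [sum(wi * p * yi for wi, p, yi in zip(w, bp, y)) for bp in basis]
        coef = solve(a, b)
        lnl = 0.0
        for j in range(len(t)):
            model = sum(c * col[j] for c, col in zip(coef, basis))
            lnl -= 0.5 * (w[j] * (y[j] - model) ** 2 - math.log(w[j]) + math.log(2.0 * math.pi))
        best = max(best, lnl)
    return best

def likelihood_periodogram(t, y, err, periods):
    """Improvement in maximum ln L of the sinusoid model over the constant-only model."""
    ln_l0 = max_log_like(t, y, err)
    return [max_log_like(t, y, err, p) - ln_l0 for p in periods]

# Synthetic example: 45 epochs with a 100-day sinusoid (illustrative values only).
rng = random.Random(0)
t = sorted(rng.uniform(0.0, 1000.0) for _ in range(45))
err = [1.0] * len(t)
y = [10.0 + 3.0 * math.sin(2.0 * math.pi * ti / 100.0) + rng.gauss(0.0, 1.0) for ti in t]
periods = list(range(50, 301, 10))
power = likelihood_periodogram(t, y, err, periods)
best_period = periods[power.index(max(power))]
print("highest peak at", best_period, "days")
```

Because the two models are nested and share the same jitter grid, Δ ln L is never negative; in a real analysis the jitter would be optimized continuously and the peak heights compared against analytic FAP thresholds.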

The activity indices provided in RM15 include BIS, FWHM, Hα, Na D, and
S-index. The first two are measurements of the shape of the mean
spectral line (BIS and FWHM represent asymmetry and width,
respectively), which can potentially trace activity-induced features on
the stellar photosphere. The last three are measurements of the
chromospheric emission of the star at the Hα line (Hα), the sodium D1
and D2 lines (Na D), and the calcium H+K lines (S-index). Chromospheric
indices are also supposed to trace the presence of active regions on
the star that might be responsible for apparent Doppler shifts. More
precise definitions and possible connections to activity-induced
signals are given in RM15 and references therein. The results of signal
searches on the five indices used by RM15 (plus available V band
photometry from the ASAS survey) are summarized in Figure 1.


No significant periodicity is detectable in BIS. Several other indices
show multiple peaks above the 1% and 0.1% FAP thresholds (horizontal
dashed and solid lines, respectively). However, several of the peaks
have similar Δ ln L values, meaning that they fit the data similarly
well. The only exception is the long period trend (marked as 5000+ days
in Fig. 1), which in some cases produces a much larger improvement of
the likelihood (e.g. FWHM and Hα; second and third panels from the top,
respectively). Although the periodograms in RM15 also show a likely
long period trend in several indices, this evidence was disregarded as
irrelevant in RM15 using generic arguments that are not supported by
the literature. That is, most stars in the M-dwarf subsample of the
HARPS-GTO program (Kapteyn’s star is part of it) were found to show
chromospheric variability in similar indices over long time-scales by
Gomes da Silva et al. (2012).


In summary, signals at 5000+, 1100, 270, 135 and 88 days would explain
the activity data equally well (or even better, depending on the
index). Given this ambiguity in the preferred periods of the various
activity indices, the choice made by RM15 of a rotation period at 143
days seems rather arbitrary.
3. Search for correlations in the Doppler
data

3.1. Model 

The next step in RM15’s analysis was to assess the significance of
linear correlations of the Doppler signals with the activity indices.
We implement such correlations by adding terms that are linear in the
activity data to the radial velocity model, as follows:


$$ v(t) = M(\theta, t) + \sum_i c_i I_i , \qquad (1) $$


where M contains all the Doppler variability modeled by Keplerian
signals, and θ lists the usual parameters used in RV modelling (see
Tuomi & Anglada-Escudé 2013 for an example). Activity measurements
obtained simultaneously with v(t) are denoted I_i, and the sum over i
runs over all the activity indices under consideration, each with its
own linear coefficient c_i. As discussed before, these indices are
i = BIS, FWHM, Hα, Na D, S-index.
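As a concrete reading of Eq. 1, the sketch below evaluates the model at one epoch. It is an illustration under simplifying assumptions that are not part of the paper: orbits are taken to be circular (so each Keplerian reduces to a sinusoid), a constant offset `gamma` stands in for the remaining parameters in θ, and all names and numerical values are hypothetical.

```python
import math

def circular_keplerian(t, period, semi_amplitude, phase):
    """Doppler signal of one planet, assuming a circular orbit (e = 0)."""
    return semi_amplitude * math.sin(2.0 * math.pi * t / period + phase)

def rv_model(t, planets, gamma, coeffs, indices):
    """Evaluate v(t) = M(theta, t) + sum_i c_i I_i at a single epoch.

    planets : list of (period, semi_amplitude, phase) tuples
    gamma   : constant velocity offset (part of theta in this sketch)
    coeffs  : dict mapping activity-index name -> linear coefficient c_i
    indices : dict mapping activity-index name -> value I_i measured at t
    """
    doppler = gamma + sum(circular_keplerian(t, *p) for p in planets)
    activity = sum(c * indices[name] for name, c in coeffs.items())
    return doppler + activity

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
v = rv_model(
    t=0.0,
    planets=[(48.6, 2.0, math.pi / 2.0)],
    gamma=1.0,
    coeffs={"Halpha": 0.5},
    indices={"Halpha": 2.0},
)
print(v)
```

In a global fit, the planet parameters, the offset and every c_i would be adjusted simultaneously rather than one after another.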


Given a model, one can search for the combination of parameters that
optimizes a figure of merit (global optimization), and then decide
whether the inclusion of a correlation term or a planet is warranted
given the improvement of the reference statistic. As long as global
optimization is applied (all parameters adjusted simultaneously), there
are various ways to assess the significance of planetary signals or
correlations using either Bayesian or frequentist approaches
(Anglada-Escudé & Tuomi 2012). A Bayesian approach consists of
assessing which model has the highest probability given the data.
Frequentist confidence tests evaluate the chances of obtaining an
improvement of a statistic by an unfortunate combination of random
errors. While RM15 show some apparent correlations when representing
one Doppler signal against some of their activity data, the
significance of those correlations was never established using model
comparison. The next two sections show that the correlations claimed in
RM15 are not significant when a global fit to the data is obtained in
either framework.


3.2. Frequentist analysis 

Fig. 2.— Likelihood-ratio periodograms for the first (top, Kapteyn’s c,
k=1 planet) and second Doppler signals (bottom, Kapteyn’s b, k=2
planets), without linear correlations (gray) and including linear
correlations with the Hα index (connected black dots). The peaks for
the Doppler signals remain above the 1% and 0.1% FAP thresholds in both
cases.

In RM15, the strongest apparent correlation was reported to be in the
chromospheric flux as measured by their Hα index. In Fig. 2 we present
likelihood-ratio periodograms of the combined HARPS and HIRES data
(each data-set has its own linear correlation coefficient as a free
parameter). As shown in Fig. 2, the significance of both signals (120
and 48.6 days) remains well above the 0.1% FAP threshold, even when
linear correlations are included in the model. If linear correlations
could explain the data better, adding a Keplerian signal would not
improve the fit substantially and its peak would be suppressed below
the threshold. A similar result is obtained by using the other activity
indices from RM15 (omitted here for brevity). In summary, the
likelihood analysis indicates that the linear correlation model cannot
account for the presence of either Doppler signal.
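The logic of this test can be illustrated with a toy calculation: if a baseline model that already contains a correlation term c·I absorbs the variability, adding a sinusoid yields almost no gain in ln L, whereas a genuine periodic signal still produces a large gain. The sketch below assumes fixed Gaussian uncertainties (so Δ ln L is half the drop in χ²), a single activity index, circular signals, and entirely synthetic data; none of the names or numbers come from RM15 or from the analysis in this paper.

```python
import math
import random

def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def chi2_min(columns, y, sigma):
    """Minimum chi-square of a linear model defined by its design-matrix columns."""
    w = [1.0 / s ** 2 for s in sigma]
    a = [[sum(wi * p * q for wi, p, q in zip(w, cp, cq)) for cq in columns] for cp in columns]
    b = [sum(wi * p * yi for wi, p, yi in zip(w, cp, y)) for cp in columns]
    coef = solve(a, b)
    return sum(
        wi * (yi - sum(c * col[j] for c, col in zip(coef, columns))) ** 2
        for j, (wi, yi) in enumerate(zip(w, y))
    )

def delta_lnl_sinusoid(t, rv, sigma, index, period):
    """Gain in maximum ln L from adding a sinusoid to a baseline with offset + c*I."""
    om = 2.0 * math.pi / period
    base = [[1.0] * len(t), list(index)]
    full = base + [[math.sin(om * x) for x in t], [math.cos(om * x) for x in t]]
    return 0.5 * (chi2_min(base, rv, sigma) - chi2_min(full, rv, sigma))

rng = random.Random(3)
n = 90
t = [rng.uniform(0.0, 2000.0) for _ in range(n)]
sigma = [1.0] * n
index = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Case 1: RVs contain a real 48.6-day signal that is unrelated to the index.
rv_planet = [2.0 * math.sin(2.0 * math.pi * ti / 48.6) + rng.gauss(0.0, 1.0) for ti in t]
# Case 2: RVs are driven by the activity index alone.
rv_activity = [1.5 * ii + rng.gauss(0.0, 1.0) for ii in index]

gain_planet = delta_lnl_sinusoid(t, rv_planet, sigma, index, 48.6)
gain_activity = delta_lnl_sinusoid(t, rv_activity, sigma, index, 48.6)
print(gain_planet, gain_activity)
```

In the first case the sinusoid survives the inclusion of the correlation term; in the second the correlation term leaves nothing for the sinusoid to explain.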


3.3. Bayesian analysis 

In this section we perform a Bayesian analysis to evaluate the
significance of correlations of the RV data with activity indices,
again assuming the linear model in Eq. 1. As before, we use the values
exactly as provided in RM15 for simplicity in the discussion. All
linear correlation terms (c_1 corresponds to HARPS BIS; c_2 to HARPS
FWHM; c_3 to HARPS Hα; c_4 to HARPS Na D; and c_5 to the HARPS S-index)
were tested at the same time by simultaneously including them all as
free parameters. As a figure of merit for model comparison, we obtained
the integrated likelihoods of models with and without signals and
linear correlation terms. These integrated likelihoods (sometimes
called evidences, E) were calculated by setting the priors as discussed
in Tuomi & Anglada-Escudé (2013), and uniform ones for the parameters
c_i. The algorithm used for the estimation of the integral is based on
a mixture of Markov Chain Monte Carlo samples from both the posterior
and prior (Newton et al.).


Fig. 3 illustrates the posterior densities of each correlation
coefficient c_i against the K semi-amplitudes of the signals at 48.6
(Kapteyn’s b) and 120 days (Kapteyn’s c). The posterior densities were
sampled using the adaptive-Metropolis posterior sampling algorithm
(Haario et al. 2001). Two features would be expected for a radial
velocity signal traced by an activity index. Firstly, the posterior
densities in Fig. 3 would show a tilted elliptical shape and the value
of the corresponding c_i would be significantly different from 0.
Secondly, K would be consistent with 0 in the sense that the 95% (or
99%) equiprobability contours would overlap with zero. Some of the
plots show mild hints of correlation (tilted ellipses), but all
distributions for the c_i are broadly consistent with 0. In contrast,
the expected value for the semi-amplitude of Kapteyn’s b is distinct
from 0 at a 5σ level (even higher for Kapteyn’s c), where σ is the
standard deviation of the posterior density in each K (see Fig. 3). The
reason for the apparent contradiction with the claims in RM15 is
explained in the next section.

Table 1 summarizes the model probabilities with linear correlations and
planet signals included. The evidence ratios between models with k and
k − 1 signals remain well above any reasonable significance threshold
(e.g. model probability ratios larger than the factors of 150-1000
usually required to claim a confident detection). The models including
linear correlations (right) have slightly better integrated
probabilities than those without (left), but the improvement is only a
factor of 12 when comparing the models with k = 2. This negligible
level of significance of correlated variability is again consistent
with the confidence level contours of Fig. 3, which imply that all c_i
are compatible with 0.
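The arithmetic behind these statements is easy to reproduce from the logarithmic evidences listed in Table 1. The short sketch below (variable names are illustrative) recomputes the ln(E_k/E_{k−1}) columns, converts a log-ratio of +13.6 into a probability factor, and compares every ratio with the most demanding detection threshold of 1000 mentioned above.

```python
import math

# ln E_k values as listed in Table 1, keyed by the number of planets k.
ln_e_keplerian = {0: -277.7, 1: -260.1, 2: -238.8}
ln_e_with_corr = {0: -273.6, 1: -254.9, 2: -241.3}

def log_ratio(ln_e, k):
    """ln(E_k / E_{k-1}) for a given column of the table."""
    return ln_e[k] - ln_e[k - 1]

ratios = {
    "Keplerian only": [log_ratio(ln_e_keplerian, k) for k in (1, 2)],
    "Keplerian + correlations": [log_ratio(ln_e_with_corr, k) for k in (1, 2)],
}
for name, values in ratios.items():
    print(name, [round(v, 1) for v in values])

# A log-ratio of +13.6 expressed as a factor in model probability.
factor = math.exp(13.6)
print(f"exp(13.6) = {factor:.2e}")

# Magnitude of the k = 2 difference between the two columns, as a factor.
k2_factor = math.exp(abs(ln_e_with_corr[2] - ln_e_keplerian[2]))
print(f"k = 2 column difference: factor of about {k2_factor:.0f}")

# Every k versus k-1 comparison exceeds ln(1000), the stricter threshold.
all_confident = all(v > math.log(1000.0) for vals in ratios.values() for v in vals)
print(all_confident)
```

The smallest k versus k − 1 log-ratio in the table (+13.6) is roughly twice ln(1000) ≈ 6.9, which is why the planetary signals remain confidently detected in both columns.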

4. Origin of the correlation proposed by 
RM15 



Fig. 4.— Correlation between the Hα index and the RVs once all signals
except Kapteyn’s b have been removed from the data. The thin violet
line is the maximum likelihood fit to the data we obtained, and the
thick violet lines represent alternative fits within 1σ of the obtained
correlation coefficient. The fit proposed by RM15 is represented by a
red line, and the 1σ limits of their law are illustrated as dotted red
lines.

Fig. 3.— Posterior densities and equiprobability contours of the semi-amplitudes of the planet candidates, K_c
(top) and K_b (bottom), against the linear correlation terms defined in the text (x-axis). The contours contain
50%, 95%, and 99% of the probability density, respectively. The 3σ and 5σ intervals of the distributions are
shown for K_b and K_c to demonstrate how significantly K_b and K_c differ from 0. On the other hand, all c_i
are found to be broadly consistent with 0.


Table 1

Natural logarithms of the integrated model probabilities E and their ratios.

Number of planets    Keplerian only                 Keplerian + correlations
k                    ln E_k    ln(E_k/E_{k-1})      ln E_k    ln(E_k/E_{k-1})
0                    -277.7    ...                  -273.6    ...
1                    -260.1    +17.6                -254.9    +18.7
2                    -238.8    +21.3                -241.3    +13.6 (a)

Note. — (a) As a reference, a ln(E_k/E_{k-1}) of +13.6 indicates that the model
with k planets has a higher probability than a model with k − 1 planets by
a factor e^13.6 = 8.1 × 10^5.


There is a fundamental difference between the procedure we have used
here to assess the presence of correlations and the one used by RM15.
While we used a global fit to the data to constrain the coefficients,
RM15 used the predictions of the two planet model (with no errors) to
perform their analysis. Specifically, RM15’s Figure 3 (top-central
panel) shows the Hα index against the Doppler model of planet b. In our
Figure 4 we show the same plot but present the radial velocity
measurements after removing all signals except planet b. The linear
correlation law derived from our Bayesian analysis in the previous
section is presented in violet. Models showing allowed values of the
correlation coefficient at the ±1σ intervals are also represented as
thick violet lines, which visually illustrates its large uncertainty.
The best correlation law proposed by RM15 is shown as a red line, and
red dotted lines show values of the coefficient at their reported ±1σ
values. While the linear correlation law reported by RM15 is well
within our 1σ interval, their reported uncertainties are severely
underestimated, producing the spurious artifact of a significant
correlation. This is a direct consequence of using the RV model
predictions (no uncertainties), instead of the actual data, when
testing the existence of potential correlation laws. We note, for
example, that even the Doppler model contains uncertainties, which were
ignored in RM15.
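The effect described above can be reproduced qualitatively with a toy experiment. The sketch below is not a reconstruction of either analysis: it assumes a single activity index that is statistically unrelated to the radial velocities, a noise-free circular "planet model", ordinary least squares, and arbitrary synthetic numbers. It simply shows that fitting a correlation law to noise-free model predictions returns a much smaller formal uncertainty on the coefficient than fitting the same law to noisy measurements.

```python
import math
import random

def slope_and_error(x, y):
    """Ordinary least-squares slope of y on x and its standard error from the residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    residuals = [(b - mean_y) - slope * (a - mean_x) for a, b in zip(x, y)]
    variance = sum(r * r for r in residuals) / (n - 2)
    return slope, math.sqrt(variance / sxx)

rng = random.Random(1)
n = 100  # arbitrary number of epochs for the illustration
t = sorted(rng.uniform(0.0, 3000.0) for _ in range(n))

# Activity index with no physical link to the velocities (assumption of the toy).
index = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Noise-free predictions of a circular planet model with a 48.6-day period.
rv_model = [2.0 * math.sin(2.0 * math.pi * ti / 48.6) for ti in t]
# "Observed" velocities: the same model plus measurement noise.
rv_data = [m + rng.gauss(0.0, 3.0) for m in rv_model]

slope_model, err_model = slope_and_error(index, rv_model)
slope_data, err_data = slope_and_error(index, rv_data)
print(f"model predictions: c = {slope_model:+.2f} +/- {err_model:.2f}")
print(f"actual data:       c = {slope_data:+.2f} +/- {err_data:.2f}")
```

With uncertainties that are too small, even a coefficient that is compatible with zero in the data can look like a significant correlation.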

5. Discussion 

We have shown that linear correlations of RVs with activity indicators
in the currently existing data are insignificant for Kapteyn’s star
when a global fit to the data is obtained. This stands in contrast to
the claims made in RM15, which were based on a number of approximate
physical assumptions and the implementation of ad hoc procedures. We
also want to stress that the interpretation of the 143 d periodicity
found by RM15 in several indicators as the rotation period seems
premature: alternative periods of 88 d, 135 d or 270 d are similarly
likely, and long-term activity trends cannot be ruled out either. Even
if for the moment we assume that the star rotates with a period of
143 d, it is not straightforward to use this as an argument against a
Doppler signal close to P_rot/3, because there is no activity signal at
P_rot/2 or P_rot/3. Given all these caveats, we consider that the
current Doppler data of Kapteyn’s star are most easily explained by the
presence of two planets as proposed in Anglada-Escudé et al. (2014),
rather than by activity-induced variability as proposed by RM15.


A clear distinction must be made between the statistical significance
of RV signals and the physical presence of planets (together with the
merit of their detection or falsification). We advocate for
comprehensive scientific discussions about the former instead of
running into premature and unsupported statements about the latter. We
conclude by emphasizing that the intention of this paper is not to
rescue the planetary status of Kapteyn’s b or any other planet
detection, but to stress the importance of objective global analysis
techniques in serious scientific discussions.